EPPS 6356 Data Visualization Project

This storyboard presents our final project: a set of visualizations of Formula 1 racing data, focused on the drivers and the different circuits.

Data showcase 1: Which factors are important in determining the best driver?


Multiple linear regression was used to determine which factors matter most when evaluating the best driver.

Wins = b0 + b1 (Pole Wins) + b2 (Total Points) + b3 (Fastest Laps) + b4 (Podiums) + b5 (1st WDC Age) + e
Note: b0 is the intercept of the regression line, b1 through b5 are the coefficients of the predictors, and e is the model error (residual), i.e. the variation not explained by the model.

R^2 = 0.9904, p-value = 5.769e-06
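
To illustrate how a model of this form is fitted and how R^2 is obtained, here is a minimal sketch in plain Python. It is not the code used for the project. The function names (solve_linear_system, fit_ols) and the toy numbers are assumptions made up for illustration, and only two predictors are used to keep it short.

```python
# Minimal ordinary-least-squares sketch in pure Python (no third-party packages).
# All numbers below are made-up toy values, NOT the project's driver data.

def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so rows can be reduced together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Return (coefficients, r_squared) for y = b0 + b1*x1 + ... + e."""
    # Prepend a constant 1 to each row so the first coefficient is the intercept b0.
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    # Normal equations: (X^T X) b = X^T y.
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, row)) for row in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)               # total variation
    return coefs, 1.0 - ss_res / ss_tot


# Toy predictors: (pole wins, podiums) for six hypothetical drivers.
toy_rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 4)]
# Toy response built exactly as wins = 1 + 2*poles + 0.5*podiums, so e = 0.
toy_wins = [1 + 2 * p + 0.5 * q for p, q in toy_rows]

coefs, r_squared = fit_ols(toy_rows, toy_wins)
print([round(c, 6) for c in coefs], round(r_squared, 6))
```

Because the toy response is built with no error term, the fit should recover an intercept and slopes close to 1, 2, and 0.5, with R^2 close to 1. Real data would include residual variation, which is what pushes R^2 below 1.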

All factors were significant except Fastest Laps and 1st WDC Age. We also tested driver height as a predictor, but it was not significant (p-value = 0.72785).

The residuals are not perfectly normally distributed: the histogram is slightly skewed at the tails. In the normal Q-Q plot, normality is more apparent because the points follow a roughly straight line.
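
As a rough illustration of what a normal Q-Q plot compares, the sketch below pairs sorted residuals with the quantiles of a standard normal distribution. It is an assumption-laden example, not the project's code: the residual values are invented, the helper names (normal_qq_points, pearson_r) are hypothetical, and the plotting positions (i - 0.5) / n are only one common convention.

```python
from statistics import NormalDist, mean


def normal_qq_points(residuals):
    """Return (theoretical quantile, sample quantile) pairs for a normal Q-Q plot."""
    n = len(residuals)
    sample = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n: an assumed convention, others exist.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sample))


def pearson_r(pairs):
    """Pearson correlation of the paired values; near 1 means a near-straight line."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Invented residuals for illustration only, NOT taken from the fitted model.
toy_residuals = [-1.8, -0.9, -0.4, -0.1, 0.0, 0.2, 0.5, 0.8, 1.1, 2.3]

points = normal_qq_points(toy_residuals)
qq_corr = pearson_r(points)
print(round(qq_corr, 3))
```

Plotting the pairs in points gives the Q-Q plot itself. The correlation is just a quick numeric summary of how closely the points follow a straight line, and it does not replace looking at the plot or running a formal normality test.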

Data showcase 2: Circuit Data


This chart presents drivers’ race win records across the circuits. The legend at the bottom lists the circuits in the data set, and the bar chart shows the number of wins for each driver.

Data showcase 3: Wins per team by track since 2006


Data showcase 4: Wins per driver since 2006


Data showcase 5: Wins based on starting grid position


Data showcase 6: F1 WDC Racer Ages


Data showcase 7: F1 WDC Racers and their Podium Wins


Data showcase 8: Average Pit Stop Time from 2011 to 2021


Data showcase 9: Total points in 2021 for different racers


Data showcase 10: 2021 F1 Teams Total Points


Data showcase 11: 2021 US Grand Prix Qualifying


These two graphs were created using the “f1dataR” library!

Data showcase 12: Fernando Alonso’s record-breaking distance!


Data showcase 13: Frequency of Max Verstappen’s Finishing Points


Data showcase 14: Frequency of Lewis Hamilton’s Finishing Points